ADDRESSING

THE TRAVELING SALESMAN PROBLEM
THROUGH EVOLUTIONARY ADAPTATION

By David Fogel

INTRODUCTION 

The optimization of the traveling salesman problem continues to
receive attention for three reasons: (1) its solution is computationally
difficult although the algorithm itself is easily expressed; (2) it is
broadly applicable to a variety of engineering problems; and (3) it has
become something of a "benchmark" problem for comparison. The task is to
arrange a tour of n cities such that each city is visited only once and the
length of the tour (or some other cost function) is minimized. For an exact
solution, the only known algorithms require the number of steps to grow at
least exponentially with the number of elements in the problem. Brute
force methods of finding the shortest path by which a traveling
salesman can complete a tour of n cities require compiling a list of
(n-1)!/2 alternative tours, a number that grows faster than any finite
power of n. The task quickly becomes unmanageable.
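
To give a sense of how quickly the (n-1)!/2 count grows, the following is a minimal Python sketch. The function name `count_tours` is an illustrative choice and not part of the original work.

```python
import math

def count_tours(n: int) -> int:
    """Number of distinct closed tours of n cities, (n-1)!/2.

    The start city is fixed and each tour is counted once for both
    directions of travel, which gives the division by two.
    """
    if n < 3:
        raise ValueError("need at least three cities")
    return math.factorial(n - 1) // 2

# Print the count for a few of the problem sizes discussed in the text.
for n in (5, 10, 24, 50):
    print(n, count_tours(n))
```

Even at ten cities there are already 181,440 distinct tours to enumerate.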



BACKGROUND 


Two recent papers (Goldberg and Lingle, 1985; Grefenstette et al.,
1985) addressed the traveling salesman problem through use of the
genetic algorithm as proposed by Holland (1975). This algorithm is an
offshoot of the evolutionary programming concept offered by Fogel (1962,
1964; Fogel et al., 1966).

In  Fogel’s  evolutionary  procedures  the  process  of  iterative  mutation 
and  selection  is  simulated  to  evolve  a  logic  most  suitable  for  resolving 
the problem at hand. Intelligent behavior is viewed as requiring prediction
of  an  environment  coupled  with  the  use  of  such  predictions  for  the  sake  of 
controlling  that  environment  (to  the  greatest  extent  possible).  The 
behavior  of  each  artificial  organism  is  constructed  as  a  finite  state 
machine,  a  general  mathematical  function  that  does  not  constrain  the 
represented  transduction  to  be  linear,  passive,  or  without  hysteresis. 

The evolutionary process is simulated in the following manner: an
original "machine" (an arbitrary logic or a "hint") is measured in its ability
to predict each next event in its "experience" with respect to whatever
payoff function has been prescribed. Progeny are now created through
random mutation of this "parent" machine. They are scored in a similar
manner to the parent in predictive ability. If the parent is better than its
offspring, the parent is used to generate other offspring. If, however, an
offspring is better than its parent, that offspring becomes the new parent.
This assures non-regressive evolution. An actual prediction is made when
the  predictive  fit  score  demonstrates  that  a  sufficient  level  of  credibility 
has  been  achieved.  The  surviving  machine  generates  a  prediction,  indicates 
the  logic  of  this  prediction  and  becomes  the  progenitor  for  the  next 
sequence  of  progeny,  this  in  preparation  for  the  next  prediction.  Thus, 
randomness  is  selectively  incorporated  into  the  surviving  logic.  The 
sequence  of  predictor  machines  demonstrates  phyletic  learning,  an 
inductive  generation  of  sequences  of  hypotheses  concerning  the  relevant 
regularities  found  within  the  experienced  environment,  in  the  context  of 
the  given  payoff  function. 

Holland's approach differs from Fogel's. Rather than describe
each  organism  only  in  terms  of  its  behavior,  Holland  emphasizes  the  coding 
structures  which  generate  such  organisms.  Holland's  genetic  algorithms 
search  a  parameter  space  where  "any  point  in  the  parameter  space  can  be 
represented  as  an  n  bit  vector."  "There  are  two  primary  operations  applied 
to  the  population  by  a  genetic  algorithm.  Reproduction  changes  the 
contents  of  the  population  by  adding  copies  of  genotypes  with  above- 
average  figures  of  merit."  "Crossover  is  the  primary  means  of  generating 
plausible  new  genotypes  for  addition  to  the  population"  (Ackley,  1985). 

Holland defines crossover as taking two coding structures,
$A_1 = a_{11} a_{12} \ldots a_{1n}$ and $A_2 = a_{21} a_{22} \ldots a_{2n}$,
and at a random point "x" between 1 and n, exchanging the set of
attributes to the right of this position, yielding offspring of the form
$A^* = a_{11} a_{12} \ldots a_{1x} a_{2(x+1)} \ldots a_{2n}$. "The
'offspring' is added to the population, displacing some other genotype
according to various criteria, where it has the opportunity to flourish or
perish depending on its fitness. Mutation provides a chance for any allele
to be changed to another randomly chosen value. If the mutation rate is too
low, possibly critical alleles missing from the initial population will have
only a small chance of getting...into the population. However, if the
probability of a mutation is not low enough, information...will be steadily
lost to random noise" (Ackley, 1985).
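
The crossover just defined can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `single_point_crossover` is hypothetical, lists stand in for coding structures, and the cut point is drawn from 1 to n-1 so that the offspring always takes material from both parents.

```python
import random

def single_point_crossover(a, b, x=None, rng=random):
    """Return the offspring a[0..x-1] + b[x..n-1] for a cut point x.

    If x is not supplied it is drawn uniformly from 1..n-1 (an assumption;
    the text only says "a random point between 1 and n").
    """
    if len(a) != len(b):
        raise ValueError("coding structures must have equal length")
    if x is None:
        x = rng.randint(1, len(a) - 1)
    return a[:x] + b[x:]

# Applied directly to two city orderings, the offspring need not be a tour.
tour_a = [3, 5, 1, 2, 7, 6, 8, 4]
tour_b = [1, 8, 5, 4, 3, 6, 2, 7]
child = single_point_crossover(tour_a, tour_b, x=3)
print(child)
```

With the cut after the third position, the offspring visits city 3 twice and omits city 8, which is why a permutation coding cannot be used with this operator without further measures.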

Holland  likens  the  actual  code  being  mutated  to  that  of  the  genetic 
code  that  defines  a  natural  organism.  While  Fogel  et  al.  (1966)  only  used 
small  degrees  of  "background"  mutation,  Holland  incorporates  the 
operations  of  gene  "crossover"  and  "inversion"  among  other  actual  biologic 
genetic  recombinations.  Although  Holland’s  work  has  gone  largely 
unnoticed  for  some  time,  today  renewed  attention  is  being  given  to  genetic 
algorithms. 

Goldberg and Lingle (1985) offered several observations of the
genetic algorithm (GA) as it relates to the traveling salesman problem:

"1) Simple genetic algorithms work well in problems which can be
coded so the underlying building blocks (highly fit, short defining
length schemata) lead to improved performance.

"2) There are problems (more properly codings for problems) that are
GA-hard - difficult for the normal reproduction + crossover +
mutation processes of the simple genetic algorithm.

"3) Inversion is the conventional answer when genetic algorithmists
are asked how they intend to find good string ordering, but inversion
has never done much in empirical studies to date.

"4) Despite numerous rumored attempts, the traveling salesman
problem has not succumbed to genetic algorithm-like solution."

They suggested a new type of crossover operator, the "partially-mapped
crossover (PMX)," which they believe will lead to a more efficient solution
of the traveling salesman problem.

PMX would proceed as follows: consider two possible codings of a tour
of eight cities, $A_1$ and $A_2$, a return to the initial city being implicit:

$A_1$: 3 5 1 2 7 6 8 4

$A_2$: 1 8 5 4 3 6 2 7

Two positions are determined randomly along the $A_1$ coding. The actual
cities located between these positions along $A_1$ are exchanged with the
cities located between the same positions along $A_2$. For example, if the
positions three and five are chosen, the sub-coding along $A_1$ is 1-2-7, and
the sub-coding along $A_2$ is 5-4-3. Each of these cities is then exchanged,
leading to the new tours, $A^*_1$ and $A^*_2$:

$A^*_1$: 7 1 5 4 3 6 8 2
$A^*_2$: 5 8 1 2 7 6 4 3.
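
The exchange described above can be sketched as a sequence of swaps inside each tour. This is a minimal reading of the description, not Goldberg and Lingle's own code; the function name `pmx` and the zero-based, inclusive cut positions are assumptions made for the example.

```python
def pmx(parent_a, parent_b, lo, hi):
    """Partially-mapped crossover by position-wise swaps.

    For each position i in lo..hi (zero-based, inclusive), the city that
    the other parent holds at i is swapped into position i of the child.
    Because every step is a swap, each child remains a valid tour.
    """
    def remap(tour, other):
        child = list(tour)
        for i in range(lo, hi + 1):
            wanted = other[i]
            j = child.index(wanted)          # where the wanted city sits now
            child[i], child[j] = child[j], child[i]
        return child

    return remap(parent_a, parent_b), remap(parent_b, parent_a)

# Eight-city example with positions three to five (indices 2..4).
a1 = [3, 5, 1, 2, 7, 6, 8, 4]
a2 = [1, 8, 5, 4, 3, 6, 2, 7]
child_1, child_2 = pmx(a1, a2, 2, 4)
print(child_1)
print(child_2)
```

The number of swaps equals the length of the chosen section, a point that matters when the operator is applied to long tours.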


Goldberg and Lingle (1985) reported two experiments on ten cities
where the PMX operator enabled the search to efficiently discover either
the absolute or near optimum solution.

Grefenstette et al. (1985) addressed the traveling salesman problem
using Holland's "simple crossover." This required the formation of a
special coding structure. Clearly, using this operator on two valid tours
could result in an "offspring" that was not a valid tour. As Dewdney (1985)
has commented, the authors' method for devising the appropriate coding
was ingenious.

"The representation for a five-city tour such as a, c, e, d, b turns out
to be 12321. To obtain such a numerical string reference is made to
some standard order for the cities, say, a, b, c, d, e. Given a tour such
as a, c, e, d, b, systematically remove cities from the standard list
in the order of the given tour: remove a, then c, e and so on. As each
city is removed from the special list, note its position just before
removal: a is first, c is second, e is third, d is second and, finally, b
is first. Hence the chromosome 12321 emerges. Interestingly, when
two such chromosomes are crossed over, the result is always a tour."

Unfortunately the experiments with this representation were "not very
encouraging" (Dewdney, 1985). Grefenstette et al. conducted larger
experiments than those of Goldberg and Lingle, including 50, 100 and 200
cities. In the three reported experiments, after a large number of trials
(approximately 14,000, 20,000 and 25,000, respectively), the best tours
were still far away from the expected optimal solutions.
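
The list-removal coding quoted above can be sketched as follows. The function names `encode` and `decode` are illustrative, and positions are one-based to match the quoted example. The second parent tour is made up for the demonstration.

```python
def encode(tour, standard):
    """Record each city's one-based position in the shrinking standard list."""
    remaining = list(standard)
    chromosome = []
    for city in tour:
        position = remaining.index(city)
        chromosome.append(position + 1)
        del remaining[position]
    return chromosome

def decode(chromosome, standard):
    """Rebuild the tour by removing cities at the recorded positions."""
    remaining = list(standard)
    return [remaining.pop(position - 1) for position in chromosome]

standard = list("abcde")
parent_1 = encode(list("acedb"), standard)
parent_2 = encode(list("badce"), standard)   # hypothetical second tour

# Simple crossover after the second position still decodes to a tour,
# because the k-th entry of any chromosome never exceeds n - k + 1.
child = parent_1[:2] + parent_2[2:]
child_tour = decode(child, standard)
print(parent_1, parent_2, child, child_tour)
```

The guarantee of validity comes at a price: an entry's meaning depends on every entry to its left, so the segment inherited from the second parent generally decodes to different cities than it did in that parent.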


At  this  point  it  is  natural  to  ask  "why?".  After  all,  the  traveling 
salesman  problem  only  requires  discovery  of  a  logical  pattern.  This  seems 
completely  analogous  to  what  occurs  in  nature.  If  the  crossover  of  genes 
works  in  natural  evolution,  why  shouldn’t  it  work  here? 

The answer is, in fact, that suggested by Goldberg and Lingle's second
observation: the traveling salesman problem is difficult to address using
Holland's crossover mutation. This is because the crossover operation, as
defined by Holland, does not mimic the biological crossover of genes.
Natural crossover is a phenomenon where "old linkages between genes on
homologous chromosomes are broken and new linkages are established.
Genes that reside on the same chromosome and move together are said to
be 'linked.' A linkage group is any group of genes physically linked on one
chromosome...Changes in linkage groups are not truly mutations, however,
since neither the amount nor the function of genetic material is altered"
(Levy, 1982).

Holland's  crossover  treats  the  entire  tour  as  a  chromosome  and  each 
city  in  a  tour  somewhat  as  a  gene.  While  this  does  not  change  the  amount 
of  coding,  it  greatly  alters  the  function  of  the  coding.  Natural  crossover 
allows  for  different  combinations  of  alleles.  Alleles,  by  definition,  control 
the  same  characteristic  and  occupy  the  same  place  on  similar 
chromosomes. A more appropriate biologic interpretation of a tour would
be that it is itself a gene. Crossover inside a gene is a non sequitur. The
tour is not analogous to a chromosome and each city in a tour is not
analogous to a gene. These relations are in fact anomalous.


The result of Holland's crossover is therefore a near random search
throughout the entire space of possible tours. This is, of course, the
essence of the difficulty. Dewdney (1985) has commented that by using
Holland's crossover "there is so much juggling of genes and cracking of
chromosomes that...(a parent)...is hard put to recognize its own
grandchildren." As the number of cities grows larger, Holland's crossover
effectively destroys the link between each parent and its offspring. The
results can even be worse than a complete enumeration of all possible
tours (Appendix). Adaptive plans must retain previous advances and
incorporate them into future solutions.

AN  ALTERNATIVE  APPROACH 

An alternative solution is the adaptive algorithm, so named because it
does not include any of the genetic-mimicking operators that Holland has
suggested, but instead emphasizes the behavioral appropriateness
(fitness) of the evolved trial solutions. The algorithm, which is equivalent
to Fogel's evolutionary programming restricted to single state machines,
only slightly mutates the existing tour by removing just one city from a
given list and replacing it in a different randomly chosen position. This
mutation is only mildly more complicated than the simplest possible
mutation, that is, swapping adjacent cities. It is clearly less complex than
either the PMX operator or Holland's crossover; through multiple mutation,
this single alteration can be made equivalent to either of these crossover
operators. Holland (1975) has stated: "If successive populations are
produced by mutation alone (without (genetic) reproduction), the result is
a random sequence of structures drawn from (all possible structures)."
This is only partially correct. The adaptive algorithm does result in a
random search, but only in that portion of the space relatively close to the
parent which generates the offspring. This dramatically increases the
effectiveness of the search through the state space of all possible
constructions.
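
The single-city mutation described above can be sketched as follows. The function name `insertion_mutation` and the use of Python's `random` module are assumptions for illustration; the original program is not reproduced in the text.

```python
import random

def insertion_mutation(tour, rng=random):
    """Remove one randomly chosen city and reinsert it elsewhere.

    The new position is drawn from the positions other than the one the
    city came from, so the offspring always differs from its parent.
    """
    n = len(tour)
    if n < 2:
        raise ValueError("tour must contain at least two cities")
    i = rng.randrange(n)            # position of the city to move
    j = rng.randrange(n - 1)        # new position, skipping the old one
    if j >= i:
        j += 1
    child = list(tour)
    city = child.pop(i)
    child.insert(j, city)
    return child

rng = random.Random(0)
parent = list(range(8))
offspring = [insertion_mutation(parent, rng) for _ in range(5)]
for child in offspring:
    print(child)
```

Every offspring shares all but a few adjacencies with its parent, which is the sense in which the search stays close to the parent.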


Not only must advances be retained but "dead-ends" must be
circumvented. Because there is a finite number of offspring that can be
generated through mutation, evolutionary stagnation might well occur on a
local optimum. To prevent this it is useful to randomly alter the adaptive
topography (payoff function) that is being searched. This can be
accomplished by a variety of means. One of these is to occasionally allow
for the survival of offspring that are slightly worse than their parents. In
effect, the scoring function is made "noisy."
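
One way to picture such a noisy selection rule is sketched below. The function name `survives`, the tolerance of 1.05 on the offspring-to-parent length ratio, and the acceptance probability of 0.1 are all placeholder assumptions chosen for illustration; they are not the settings used in the experiments.

```python
import random

def survives(offspring_length, parent_length, rng,
             tolerance=1.05, p_accept_worse=0.1):
    """Decide whether an offspring replaces its parent.

    A shorter tour always survives. A tour that is no more than
    `tolerance` times the parent's length survives with probability
    `p_accept_worse` (both values are hypothetical). Anything worse
    is discarded.
    """
    if offspring_length < parent_length:
        return True
    ratio = offspring_length / parent_length
    return ratio <= tolerance and rng.random() < p_accept_worse

rng = random.Random(1)
print(survives(90.0, 100.0, rng))    # better offspring always survives
print(survives(200.0, 100.0, rng))   # far worse offspring never survives
```

Setting `p_accept_worse` to zero recovers strictly non-regressive selection.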


What  results  is  analogous  to  the  searching  of  a  maze;  when  a  dead-end 
is  reached  some  backtracking  is  allowed  and  the  overall  search  is 
reinitiated.  Unfortunately,  the  topography  is  much  like  an  upside-down  bed 
of  nails,  with  some  nails  being  longer  (better)  than  others.  From  any  given 
nail, it is possible to travel to n(n-1) other nails in a single mutation by
randomly choosing a city and placing it in a different position. Unlike a
maze,  when  the  evolving  phyletic  line  reaches  a  non-optimal  nail  from 
which  no  single  mutation  results  in  a  better  tour,  it  is  impossible  to 
determine  the  "direction"  in  which  to  backtrack.  The  complete  prevention 
of  evolutionary  stagnation  is  impossible  unless  all  inheritance  is  given  up 
and  the  search  made  completely  random. 
Experiments were performed to determine the effectiveness of the
adaptive algorithm. Initially, 128 independent trials were performed on a
24 city traveling salesman problem where the cities were positioned on
the periphery of a rectangle. Clearly, the minimum length tour is equal to
the perimeter of the rectangle, in this case 250. The amount of noise that
was used is indicated in Table 1. The same degree of noise was used
throughout all of the experiments described. Of the 128 trials performed,
90.625% found the optimum solution in an average of 5297.48 iterations
(Figure 1), where the maximum number of iterations was arbitrarily set at
14,000. Figure 2 indicates the results of the remaining 9.375% of the
trials, in which the evolving tours were, at least temporarily, trapped on a
local optimum. Despite the seemingly non-complex arrangement of cities,
the numerous local optima inherent to this city-structure make this
particular traveling salesman problem somewhat recalcitrant.

Iterations                    Ratio

Less than 1,500               <= 1.05    <= 1.1    <= 1.2    > 1.2
Between 1,500 and 5,000       <= 1.05    <= 1.1    <= 1.2    > 1.2
Between 5,000 and 10,000      <= 1.05    <= 1.1    <= 1.2    > 1.2
Between 10,000 and 20,000     <= 1.05    > 1.05
Greater than 20,000           Any ratio

Table 1: The amount of noise.

To further investigate the efficiency of the algorithm, 20 experiments
were conducted requiring a tour of 50 cities where the cities were
redistributed for each experiment. In each, no optimum tours were
discovered in 20,000 iterations, but it was clear that the evolutionary
process was "solving the problem." Figure 3 indicates the results of a
typical experiment. Figure 4 indicates the mean and estimated two-sigma
limits of the evolutionary process as it discovered more and more suitable
tours as offspring were evaluated. Note that "backtracking" played an
integral part in the search.

Experiments  were  then  performed  requiring  a  tour  of  100  cities  under 
similar  conditions.  Again,  while  none  of  the  eight  experiments  found  a 
perfect tour in 20,000 iterations, the evolutionary process performed
well.  Figure  5  indicates  the  results  of  a  typical  experiment  while 
Figure  6  indicates  the  mean  and  two-sigma  limit  of  the  reduction  in  tour 
length  as  offspring  were  evaluated. 


Further experiments required a tour of 90 cities. Here, 18 trials
were performed on ten groups of nine cities that were randomly placed on
the coordinate grid. The process was allowed to evolve 32,000 offspring.
While the optimum solution remained undiscovered, it is of interest to
note that the problem was evidently addressed at two distinct levels. The
evolutionary process initially solved the problem at a gross level,
discovering the minimum tour between the groups of cities (Figure 7 and
Figure 8). Insufficient time was allowed to sort out the problem at a finer
level of detail. Figure 9 indicates the mean and estimated two-sigma
limits to the reduction of tour length up to the 20,000th iteration.

An  extremely  large  traveling  salesman  problem  was  also  analyzed. 
Here,  256  cities  were  randomly  distributed.  Based  on  the  previous  results 
it was not expected that the adaptive algorithm would discover the
optimum solution in 20,000 iterations; however, after only 10,000
iterations it had reduced the initial tour length by roughly 50 percent.
Figure 10 indicates the "surviving" tour after evaluating 10,000 offspring
while Figure 11 indicates the success of the evolutionary process in
discovering better and better tours. The available computation time
limited the analysis; however, the results were certainly encouraging.

The  evolutionary  program  was  also  extended  to  allow  for  cities 
distributed  in  three  dimensions.  Here,  50  cities  were  randomly  distributed. 
As  expected,  the  addition  of  the  third  dimension  had  little  effect  on  the 
adaptive  algorithm.  The  initial  tour  length  was  reduced  by  50  percent  in 
fewer than 6,000 iterations (see Figure 12). Figure 13 indicates two views
of  the  surviving  tour  after  evaluating  20,000  offspring. 
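
One reason the third dimension changes so little is that only the scoring of a tour depends on the coordinates; the mutation acts on the ordering of cities alone. The sketch below illustrates this with a length function that accepts points of any dimension. The name `tour_length` and the example coordinates are assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math

def tour_length(tour, coords):
    """Length of the closed tour, for coordinates of any dimension."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )

# The same function scores a planar tour and a three-dimensional one.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
corners_3d = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
order = [0, 1, 2, 3]
print(tour_length(order, square))
print(tour_length(order, corners_3d))
```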
[Figure: Traveling Salesman Problem, 50 Cities in Three Dimensions]


These experiments indicated the approximately exponential learning
ability of the adaptive algorithm. However, the quality of the resulting
tours remained to be determined. This can be assessed only when the
distribution of the (n-1)!/2 tour lengths can be approximated. Two
experiments were performed in this regard.

First, 28 cities were positioned in a square of perimeter 720 units.
This, of course, corresponds to the optimum tour length. A computer
program was written to sample 2500 tours at random and found an
estimated average tour length of $\mu = 838.8163$ (an average error of
118.8163) with an estimated standard deviation of $\sigma = 59.72235$.
Because of its flexibility, a gamma function was fitted to describe the
error distribution:

(1)  $f(x) = (2 \times 10^{-7})\, x^{3} e^{-x/30}, \quad x \ge 0.$

Thirty trials were conducted with the adaptive algorithm which yielded an
average tour length of 798.613 after evaluating 20,000 offspring. The
average tour error was approximately 79 units. Integrating (1) from zero
to 79 yields the estimated fraction of tours that were of higher quality
than this average tour error. Here, this integral, calculated using
Simpson's rule, was approximately 0.257. Thus, the average tour error of
the adaptive algorithm was superior to roughly 75 percent of the possible
tours, this after evaluating only $3.7 \times 10^{-24}$ of the state space.
It should be noted that 23 out of the 30 trials found perfect tours, but
because the square was large, locally optimal tours had great length as
compared to the optimum tour length. This dramatically increased the
average length of the 30 trials.
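
The two figures in this paragraph can be checked with a short calculation. The sketch below assumes the fitted density has the gamma form with shape 4 and scale 30 and a leading constant of 2 x 10^-7, and uses a composite Simpson's rule with 200 intervals; the function names and the interval count are choices made for the example.

```python
import math

def error_density(x):
    """Fitted error density, assumed to be 2e-7 * x^3 * exp(-x/30)."""
    return 2e-7 * x ** 3 * math.exp(-x / 30.0)

def simpson(f, a, b, intervals=200):
    """Composite Simpson's rule; `intervals` must be even."""
    if intervals % 2:
        raise ValueError("intervals must be even")
    h = (b - a) / intervals
    total = f(a) + f(b)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

# Fraction of random tours with error below the algorithm's average error.
fraction_better = simpson(error_density, 0.0, 79.0)

# Share of the (n-1)!/2 tours of 28 cities touched by 20,000 offspring.
fraction_searched = 20_000 / (math.factorial(27) // 2)

print(fraction_better, fraction_searched)
```

Under these assumptions the integral comes out at about 0.26, in line with the reported 0.257, and the searched share is close to the 3.7 x 10^-24 quoted above.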


A more complicated experiment was also performed. Here, 36 cities
were organized in four groups of nine with a minimum length tour of 460
units. Again, 2500 tours were sampled at random and yielded an estimated
average tour length of $\mu = 617.1259$ (an average error of 157.1259)
with an estimated standard deviation of $\sigma = 29.43465$. A gamma
function was again fitted to the error distribution:

(2)  $f(x) = \left[ 5.5^{28.5}\, \Gamma(28.5) \right]^{-1} x^{27.5} e^{-x/5.5}, \quad x \ge 0.$

Thirty trials were conducted with the adaptive algorithm which yielded an
average of $\bar{x} = 518.7947$ for an average error of approximately 59
units. Integrating (2) from zero to 59 yields the estimated fraction of tours
that were superior to the average results of the adaptive algorithm. This
was computed to be 0.0000006438, that is to say, the adaptive algorithm
produced tours that were generally superior to over 99.9999 percent of all
possible tours, this after examining only $3.87 \times 10^{-36}$ of the
entire tour state space.


An  additional  set  of  experiments  was  conducted  to  directly  compare 
the  adaptive  algorithm  to  the  PMX  operation.  Here,  100  cities  were 
distributed  at  random  and  30  trials  were  performed  using  both  methods. 
The  cities  were  redistributed  for  each  trial  to  minimize  the  effect  of  an 
unusual  set  of  cities.  Each  algorithm  was  allowed  to  generate  20,000 
offspring.  The  results  were: 

Adaptive Algorithm                    PMX

$\bar{x}$ = 1454.403 units            $\bar{x}$ = 4319.455 units
s = 110.951 units                     s = 165.807 units

where $\bar{x}$ is the average of the thirty trials and s is the standard
deviation of the sample.
As mentioned previously, the PMX operation does not retain sufficient
information between parent and offspring to perform effectively.
Essentially, the PMX operation is equivalent to swapping a random number
of cities in a single tour. The number of cities to be swapped is equal to
the length of the section of the tour chosen at random. The expected
number and variability of swaps per mutation are indicated in Figures 14
and 15. In relatively small problems, on the order of ten cities, the PMX
operation averages about three swaps with minimal variance. However, in
larger problems, such as the 100 city problem performed here, the PMX
averages more than 33 swaps per mutation with a rather high variance.
This prevents the required link between generations.
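
The scaling of the swap count with problem size can be illustrated by enumerating every possible section. The sketch assumes that the two cut positions are distinct and drawn uniformly from the n positions of the tour, and that the number of swaps equals their separation; how the original program drew its positions is not stated, so this is only one plausible reading. The function name is illustrative.

```python
from itertools import combinations
from statistics import mean, pstdev

def section_length_stats(n):
    """Mean and standard deviation of |i - j| over all position pairs."""
    lengths = [j - i for i, j in combinations(range(n), 2)]
    return mean(lengths), pstdev(lengths)

for n in (10, 100):
    average, spread = section_length_stats(n)
    print(n, round(average, 2), round(spread, 2))
```

Under this assumption the mean separation is (n+1)/3, about 3.7 for ten cities and 33.7 for one hundred, so roughly a third of the tour is rearranged in a single operation regardless of n.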

CONCLUSIONS

Successful adaptation does not require sophisticated mutations. In an
evolutionary scheme only the "behavior" of a coding structure is scored;
the code itself is never scored. The bottom-up view that emphasizes
mutation  operations  as  the  key  to  adaptive  plans  is  incorrect.  Competition 
occurs  not  between  coding  structures  but  between  expressed  behaviors. 
The  particular  structure  of  the  code  is  generally  unimportant. 

Further,  sophisticated  mutation  operations  can  be  detrimental.  For 
adaptation  to  succeed,  a  sufficient  link  between  parent  and  offspring  must 
be  maintained.  When  this  link  is  destroyed  the  results  can  be  worse  than  a 
random  search  of  all  possible  coding  structures.  Since  the  traveling 
salesman's tour is not analogous to an organism's chromosome, operations
of the form of Holland's crossover are unsuitable. In general, little
emphasis should be placed on specific mutation operations. Modeling a
given mutation operation in order to elicit appropriate behavior is much
like requiring an airplane to possess feathers in order to fly. It is the
mutation and selection of behavior that is important, not the structure of
the code.

Search  by  adaptive  methods  must  avoid  stagnation  in  local  optima. 
Stagnation  can  be  prevented  through  the  use  of  a  noisy  payoff  function. 
This concept is similar to that suggested by Kirkpatrick et al. (1983) for
optimization by simulated annealing, but it is not necessary to resort to
such specific analogies. In a dynamic environment, the rewards and
penalties for different behaviors vary. The search for better and better
solutions is everlasting. Evolution is a continuing process with no truly
optimum solution. Incorporating noise into the adaptive algorithm prevents
stagnation of the evolving phyletic line.

Clearly,  evolutionary  adaptation  can  effectively  address  the  traveling 
salesman problem. The experiments described here indicate the efficiency
of this evolutionary search. But, in any given problem, there is no
guarantee that the optimum solution will ever be found. Evolution
discovers only what it is capable of discovering. New solutions that are
superior to old ones tend to survive. Despite this, the adaptive algorithm, a
reification of natural evolution, tends to discover exceedingly appropriate
behavior in the context of a given criterion.
APPENDIX

A completely random search (with replacement) will take roughly
twice as long to find the optimum solution as an enumerative search
(without replacement). To show this, consider the following two theorems:

Theorem 1: If there are B possible solutions and only one
optimum solution, the expected number of trials that must be
made before the optimum solution is found, using an enumerative
search, assuming one trial is made at a time, is equal to
(B+1)/2.

Proof: In an enumerative search, sampling is made without replacement.
The probability, therefore, of discovering the optimum solution on any
given trial is equal to the product of the probabilities of not discovering
the optimum solution on any prior trial multiplied by the reciprocal of the
number of untried solutions. The expected number of trials that would have
to be examined before finding the optimum solution would therefore be:

\begin{align*}
\sum x f(x) &= 1 \cdot B^{-1} + 2\left[\tfrac{B-1}{B}\right](B-1)^{-1}
  + 3\left[\tfrac{B-1}{B}\right]\left[\tfrac{B-2}{B-1}\right](B-2)^{-1} \\
&\quad + \cdots + (B-1)\left[\tfrac{B-1}{B}\right]\cdots\left[\tfrac{2}{3}\right]\left[\tfrac{1}{2}\right]
  + B\left[\tfrac{B-1}{B}\right]\cdots\left[\tfrac{2}{3}\right]\left[\tfrac{1}{2}\right] \cdot 1 \\
&= 1 \cdot B^{-1} + 2B^{-1} + 3B^{-1} + \cdots + (B-1)B^{-1} + B \cdot B^{-1} \\
&= B^{-1}\left(1 + 2 + 3 + \cdots + (B-1) + B\right) \\
&= B^{-1}\left[B(B+1)/2\right] \\
&= (B+1)/2.
\end{align*}

Q.E.D.
Theorem 2: If there are B possible solutions and only one
optimum solution, the expected number of trials that must be
made before the optimum solution is found, in a completely
random search, assuming one trial is made at a time, is equal to
B.

Proof: In a completely random search, sampling is made with
replacement. The probability, therefore, of discovering the optimum
solution on any given trial is equal to the product of the probabilities of
not discovering the optimum solution on any previous trial multiplied by
the reciprocal of the total number of possible solutions. The expected
number of trials that would have to be examined before finding the optimal
solution would therefore be:

\begin{align*}
\sum x f(x) &= 1 \cdot B^{-1} + 2\left[\tfrac{B-1}{B}\right]B^{-1}
  + 3\left[\tfrac{B-1}{B}\right]^{2}B^{-1} + \cdots \\
&= B^{-1}\left(1 + 2\left[\tfrac{B-1}{B}\right] + 3\left[\tfrac{B-1}{B}\right]^{2} + \cdots\right) \\
&= B^{-1}\left[\frac{1}{1 - (B-1)/B}\right]^{2} \\
&= B.
\end{align*}

Q.E.D.
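
Both expectations can be checked numerically for a small search space. The sketch below follows the wording of the two proofs term by term; the function names, the use of exact fractions for the enumerative case, and the truncation of the infinite series at 10,000 terms for the random case are choices made for the example.

```python
from fractions import Fraction

def expected_trials_enumerative(b):
    """Expected trials without replacement, summed term by term."""
    expected = Fraction(0)
    p_not_found_yet = Fraction(1)
    for trial in range(1, b + 1):
        untried = b - trial + 1
        # P(found on this trial) = P(missed so far) * 1/untried
        expected += trial * p_not_found_yet * Fraction(1, untried)
        p_not_found_yet *= Fraction(untried - 1, untried)
    return expected

def expected_trials_random(b, terms=10_000):
    """Expected trials with replacement, truncating the infinite series."""
    miss = (b - 1) / b
    return sum(x * miss ** (x - 1) / b for x in range(1, terms + 1))

b = 10
print(expected_trials_enumerative(b))   # (b + 1) / 2
print(expected_trials_random(b))        # approaches b
```

For ten possible solutions the two values are 11/2 and very nearly 10, so the random search takes a little under twice as long, and the ratio approaches two as the number of solutions grows.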


